ABSTRACT

Over the last year the J. Paul Getty Trust's Web presence has evolved
from a group of disparate, independently maintained Web sites into a homogeneous,
consistently branded one: getty.edu. This transformation recently culminated in the
implementation of a leading Content Management System (CMS). There were, and are, many
process changes and challenges in implementing a CMS in an institution such as the
Getty. These issues are not unique, and they run the gamut from social, to business, to
technical. This paper highlights the major issues, describes the route the Getty
took, and gives an insight into the functionality and capability of a leading CMS
application for a content-rich museum Web site. Topics covered include: the
maintenance burden of re-launching the Web site before implementing a CMS; preparing
for a CMS; CMS requirements; implementation; environment; templating; the templating
process; workflow; deployment; the rollout; and future initiatives. (Author/MES)

Keywords: Content Management, Templates, Workflow, 
The Getty is a campus of six programs: Museum, Research Institute,
Conservation Institute, Trust Administration, Grant Program and 
Leadership Institute. All these programs contribute content for the Web 
site, resulting in a very content-rich environment. It currently includes 
details on 4,000 works of art; 1,500 artist biographies; 3 hours of 
streaming video; past and present exhibitions; art historical research 
papers and tools; conservation research papers and tools; on-line library 
research tools; detailed visitor information; an event calendar, and a 
parking reservation system. 

Up until a year ago, each program independently maintained its own 
portion of getty.edu. A gateway homepage was placed at the top level, 
but as soon as one drilled down into the Web site, it was immediately
apparent that there was no consistent design, navigation, or look and 
feel, and no way to navigate around the site without going back to the 
homepage. 

A trustee-level decision was made to present the Web site with a 
consistent design and look and feel - a Web group was formed and 
charged with this task. Accomplishing this task would include the 
implementation of a Content Management System (CMS) to ensure that 
all the programs could continue to contribute content to the Web site but 
in a managed and decentralized way. Rather than implement the 
redesign and CMS simultaneously, the redesign was implemented first, 
but significant effort was assigned to preparing the HTML pages for later 
CMS integration. 
The Web site was re-launched in February 2001, with a unified design
and functionality and a more thematic treatment of the content. The task 
had required that the webgroup temporarily ‘own’ all the content and the 
processes associated with getting that content to the Web. The 
knowledge gained during this process was crucial in being able to identify 
the real requirements of any CMS that the Getty would need to 
implement. 

Maintenance Burden 

There was a significant burden to re-launching the Web site before 
implementing a CMS. To guarantee that ongoing development and 
maintenance on the Web site adhered to the new rules and style guide, 
all content still had to funnel through the webgroup - the new design had 
committed us to a frequent subsite update regime. GartnerGroup, which
the Getty uses to benchmark itself against similar organizations and
business processes, publishes a graph (figure 1) that summed up exactly
where we stood in the Web life cycle. Even with a webgroup of 18 people,
there were few resources to develop new content.



[Figure: Web Site Life Cycle Costs. Costs plotted against Time for three curves: Non-Web Content Management, Web Content Management, and Classic AD Cycle]

Fig 1: Website Life Cycle Costs. GartnerGroup © 1999



Preparing For a CMS 

To prepare the Getty community for a CMS, which might take 6-9 months 
to select, the webgroup temporarily established its own managed 
environment to create and deploy content to coincide with the launch of 
the redesign. On the infrastructure side, we implemented a three-tier 
staging environment (see figure 2) and a custom-written application to 
‘push’ content between tiers. This was a fairly high-level tool operating on 
directories rather than individual files. On the business side, we 
established a strict push schedule, which meant that content could only 
be moved into production twice a week - although this could be
fast-tracked for emergencies on a case-by-case basis.

Fig 2: Three-tier Staging Environment
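
The directory-level push between tiers can be sketched roughly as follows. This is an illustration only, not the Getty's custom application: the tier names, the push_directory helper and the Tuesday/Thursday schedule are all assumptions made for the example (the paper says only that production pushes ran twice a week, with an emergency fast track).

```python
import shutil
from datetime import date
from pathlib import Path

# Hypothetical tier names, in push order.
TIERS = ["development", "staging", "production"]

# Assumed push days (Monday=0): Tuesday and Thursday stand in for "twice a week".
PRODUCTION_PUSH_DAYS = {1, 3}


def push_directory(root: Path, subdir: str, source: str, today: date,
                   emergency: bool = False) -> Path:
    """Copy one whole directory from `source` to the next tier up."""
    target = TIERS[TIERS.index(source) + 1]
    if (target == "production" and not emergency
            and today.weekday() not in PRODUCTION_PUSH_DAYS):
        raise RuntimeError("production pushes run only on scheduled days")
    src = root / source / subdir
    dst = root / target / subdir
    # The tool works on directories, not individual files: the previous
    # copy of the directory is replaced as a unit.
    if dst.exists():
        shutil.rmtree(dst)
    shutil.copytree(src, dst)
    return dst
```

Working on whole directories keeps such a tool simple, at the cost of re-copying unchanged files on every push.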



Initially there was great resistance from the programs to this regime. The 
‘anarchic’ nature of content generation and deployment that had 
established itself at the Getty meant that this was a considerable culture 
shock. The programs perceived a ‘loss of control’ of their content, and 
objected to the delay between pushes. Fortunately, the reality was 
different: the schedule of pushing every other day allowed for only one 
day to create and review content before it went live. The programs soon 
fell into the regime and altered their business processes to accommodate 
the schedule. With the CMS implementation we could afford to increase 
the push schedule to whatever was appropriate - since it would be 
automatic. But the temporary situation was a good proving ground for a 
professionally managed environment. 



CMS Requirements 

Armed with the detailed knowledge of content generation and processes 
around the Getty gained during the redesign, the Web group drew up a 
requirements document for a CMS. Our ‘big ticket’ items are probably
consistent with many museums’ CMS wish lists:

• Template Driven: To guarantee consistent visual design, to 
separate content from design and to allow multiple content use. 

• Usable by the HTML Illiterate: To provide an interface that lets
contributors without HTML skills submit and review Web content.

• Workflow: The processes established at the Getty to get content to 
the Web site range from a simple two-step workflow to complex 
multi-program, concurrent-task processes. 

• File Versioning: The ability to track individual file changes: what
changed and by whom.

• Rollback: The ability to publish an edition of the Web site on any 
previous date. 

• Open/Standard Architecture: The Getty's Web infrastructure is 
based on Unix servers, iPlanet Web servers, Java/perl CGI 
applications and Oracle databases, all sitting on a Novell network. 
Any Web application has to live in this environment. 

• Virtualisation: The ability to see any changes or new content in the 
context of the entire Web site - before it is deployed. 

• Multiple Web site Management: For getty.edu and for our Intranet 
GO (Getty Online). 

• Cross-platform clients: The Getty has Mac and PC-based
contributors. 

• Multi-channel Deployment: e.g. XML and WML delivery.

• In-house Maintenance and Development: Not to have to go back 
to the vendor. 

• Vendor Stability: An important consideration given the current 
economic times. 

Having tracked the CMS market during the redesign, the Web group
narrowed the field to three CMS products through the usual RFI (Request For
Information) and RFP (Request For Proposal) processes. These were:
Stellent's Expedio (http://www.stellent.com), Vignette's StoryServer 
(http://www.vignette.com) and Interwoven's Teamsite 
(http://www.interwoven.com). After a bake-off, the Getty selected 
Interwoven as its preferred CMS vendor. 

Implementation 

Teamsite is a Web-application suite based on C++, Java, Perl/CGI and
XML technologies. It is a client/server environment in which all users
interact with it via a browser. After some testing we established that
Internet Explorer was the more compatible browser. The only exception to the
browser interface is the module to create workflows - this is a Windows
application. Teamsite offers fully integrated Templating, a very flexible
and fully integrated Workflow engine, full Virtualisation and integration
with a broad range of Web content creation applications such as
Dreamweaver and Homesite. It also stands as a solid platform on which
to further develop our Web site, by integrating with an array of
application, personalization and syndication servers - which are
long-term goals for the Web site.

Interwoven's terms and conditions of sale include an agreed method of 
installation and configuration. This means you need to buy their 
professional services or contract with a registered Interwoven partner - a
so-called ‘enabler’, of which there are many and for which there is a
‘corkage’ fee to Interwoven. After reading a number of ‘dot com’ elegies,
we were keen not to have any one of these ‘enablers’ come in, install the
product, then hold us to ransom over professional services to develop 
our site. Also, our Web group has a strong technical component and 
should be more than capable of maintaining the environment. The best 
integration solution for us was Interwoven's so-called ‘Fast Forward Core 
Pack’ - a 30-day engagement with a Teamsite consultant and project 
manager. At the end of this engagement, the client is ‘guaranteed’ to 
have a correctly installed and configured product, at least one site 
deployed through Teamsite as a file-managed process, and up to two 
templates and two workflows in place. 

We scheduled technical training to coincide with the arrival of the 
implementation consultant so that we could speak the Teamsite 
language and could assist in the installation and configuration. We 
installed the suite onto our Unix staging server since, once operational,
it would eliminate the need for a separate staging environment.

Environment 

Teamsite uses a server's native file structure to manage and deploy 
content, and manages separate Web sites as branches. Because of this
approach, the default view of a Web site within Teamsite is a file-system
view, much like Windows Explorer:

Fig 3: Teamsite's Default View of a Website



Each registered user has an account based on the server's native 
authorization technology. In our case this is LDAP. Within Teamsite, 
each user is assigned at least one workarea and has a virtual view of the 
whole site. This workarea corresponds to the section of the Web site on 
which they work; they only have permission to modify documents within 
their own workarea. Every user is also assigned a role depending on 
contribution level. This determines the authority they have within the 
environment: author, editor, administrator or master: 



• Author - the primary content creator: he owns content; can create 
and edit files; can receive assignments through the workflow 
engine and has work approved by his workarea owner. 

• Editor - creates content and oversees content creation: he owns a 
workarea; can create, edit and delete files; approves or rejects 
work of the authors; can submit files to the staging area and can 
publish editions. 

• Administrator - manages an area or branch (Web site), of which 
he is the owner and can create and delete workareas. 

• Master - has absolute power and fundamental administrative 
control over the product - a role reserved for a Unix System 
Administrator and a person to be chosen wisely. 

Fig 4: Schematic of the Teamsite Environment and Associated Roles
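
The four roles listed above can be read as a small permission table. The sketch below is an illustration under stated assumptions, not Teamsite's actual access-control model: the action names and the can helper are hypothetical, and the roles are not treated as cumulative because the paper does not say whether they are.

```python
from enum import Enum


class Role(Enum):
    AUTHOR = "author"
    EDITOR = "editor"
    ADMINISTRATOR = "administrator"
    MASTER = "master"


# Actions per role, paraphrased from the role descriptions.
# The action names are hypothetical labels chosen for this sketch.
PERMISSIONS = {
    Role.AUTHOR: {"create_file", "edit_file", "receive_assignment"},
    Role.EDITOR: {"create_file", "edit_file", "delete_file", "approve_work",
                  "submit_to_staging", "publish_edition"},
    Role.ADMINISTRATOR: {"create_workarea", "delete_workarea"},
}

FILE_ACTIONS = {"create_file", "edit_file", "delete_file"}


def can(role: Role, action: str, in_own_workarea: bool = True) -> bool:
    """Return True if `role` may perform `action`."""
    if role is Role.MASTER:
        return True  # "absolute power" over the product
    if action in FILE_ACTIONS and not in_own_workarea:
        return False  # users modify documents only within their own workarea
    return action in PERMISSIONS[role]
```

For example, can(Role.AUTHOR, "delete_file") is False while can(Role.EDITOR, "delete_file") is True, matching the distinction between the two role descriptions.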



Working within the workarea, a user first synchronises with the latest 
version of content by performing a get latest function within the staging 
area. When the appropriate edits have been made and checked within 
the virtualized view of the entire site, the content is submitted back to the 
staging area. Browsing in the virtualized view of the Web site is no 
different from browsing the site under normal circumstances. The 
staging area is where all the content from different workareas is 
integrated and tested. It is a read-only environment where nothing can be 
edited. According to the publishing schedule, a snapshot is made of the 
staging area creating an edition, which can then be deployed to the 
production server. 
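
The cycle of get latest, edit, submit and edition can be modelled in a few lines. This is a toy model with invented names (Branch, get_latest, submit, publish_edition), offered only to make the data flow concrete; it does not reflect how Teamsite stores content internally.

```python
from dataclasses import dataclass, field
from types import MappingProxyType


@dataclass
class Branch:
    """Toy model of one branch: a staging area, workareas and editions."""
    staging: dict = field(default_factory=dict)    # path -> content
    workareas: dict = field(default_factory=dict)  # name -> {path: content}
    editions: list = field(default_factory=list)   # read-only snapshots

    def get_latest(self, workarea: str) -> None:
        """Synchronise a workarea with the current staging content."""
        self.workareas.setdefault(workarea, {}).update(self.staging)

    def edit(self, workarea: str, path: str, content: str) -> None:
        """Edits happen only in a workarea, never directly in staging."""
        self.workareas[workarea][path] = content

    def submit(self, workarea: str, paths: list) -> None:
        """Copy finished files from a workarea back into staging."""
        for path in paths:
            self.staging[path] = self.workareas[workarea][path]

    def publish_edition(self):
        """Snapshot staging as an immutable edition ready for deployment."""
        snapshot = MappingProxyType(dict(self.staging))
        self.editions.append(snapshot)
        return snapshot
```

Because each edition is a frozen copy of staging, an earlier edition stays available unchanged after later submissions, which is the property the Rollback requirement depends on.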

We were looking to manage two Web sites with Teamsite: getty.edu and 
our Intranet, GO. After a week of analysis we began planning two 
development environments, each of these becoming a branch within 
Teamsite. We then ‘sucked in’ each site - a tar, FTP and untar
process - which took about a day each. So, within two weeks of the
consultant’s arrival, we had set up a pre-production environment for both
sites. With these environments in place, we scheduled some intensive 
training on the core Teamsite functions of Templating and Workflow - 
unfortunately, this required a lengthy field trip to Interwoven’s training 
facilities in San Francisco. 

Templating 

Templating is a core function of a CMS. It allows the separation of 
content from design, the automatic generation of an HTML page from 
content, guarantees consistent visual design, and allows Web pages to 
be generated by HTML-illiterate contributors. A fully templated site is one 
significant goal of full implementation. Teamsite’s templating system is 
based on XML and requires a thorough understanding to develop 
templates. We began by analyzing our Web site to identify a list of initial
template candidates, starting with a broad assumption about the
cost/benefit of converting pages at each level below the homepage. This
indicated that the greatest benefit for the cost was to be had at the lowest
levels, where many pages would be served by a single template. For example,
the museum collection subsite consists of approximately 18,000 pages. Analysis of
the design and layout indicated that we could probably serve this with
fewer than five templates, possibly even one.

Fig 5: Identifying Template Candidates

We then applied a number of qualifications: Do a number of people
contribute content to this page? Does the content on this page change 
often? Additionally, we wanted the first template candidates to also 
establish precedents for resolving issues that would globally impact the 
site. We focused on two candidates: the Job Postings, which have a lot 
of similar pages and a daily turnover, and the News Articles, which have 
a weekly turnover but a significant number of contributors (and 
consequently a complex associated workflow). 



Templating Process 



Templating is a three-step process of data capture, storage and
presentation; figure 6 shows a schematic of this process.



[Figure labels: Data Capture Template; Save; Data Content Record (XML) + Presentation Template (XML); Generate/Regenerate; Preview]

Fig 6: Templating Overview

When a user wants to submit content using a template, they first fill out a
data capture template (DCT). This is defined in an XML configuration file,
created by a developer with a good understanding of XML and some
basic programming skills. Submitting a completed DCT saves a data
content record (DCR), which is stored either as a file or in a database.
The last element is a presentation template (PT), another XML file that can
carry a variety of markup flavors (ASP, JSP, WML or HTML) as well as
embedded Perl callouts and conditional programming tags. The PT is
essentially created from an existing HTML page by giving it an XML
‘wrapper’ and substituting the content areas with appropriate XML tags
that are defined in the DCR. Figure 7 shows code fragments tracing a
single piece of content, a job title, from capture to presentation:

Fig 7: Tracing a single piece of content
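
As a rough illustration of that capture-to-presentation path, the sketch below follows a job title through the three stages. It does not use Teamsite's actual DCT, DCR or PT syntax: the element names (data-capture, field, record, item, value-of) and the capture and present helpers are invented for the example, and the conditional tags and Perl callouts of a real PT are left out.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data capture template: the fields a contributor fills in.
DCT = """<data-capture name="job_posting">
  <field name="job_title" label="Job Title" required="true"/>
  <field name="department" label="Department" required="false"/>
</data-capture>"""

# Hypothetical presentation template: an HTML page whose content areas
# have been replaced by placeholder tags named after the DCR items.
PT = """<html><body>
<h1><value-of field="job_title"/></h1>
<p>Department: <value-of field="department"/></p>
</body></html>"""


def capture(dct_xml: str, form_values: dict) -> str:
    """Check submitted values against the DCT and return a DCR as XML."""
    dct = ET.fromstring(dct_xml)
    dcr = ET.Element("record", {"type": dct.get("name")})
    for field in dct.findall("field"):
        name = field.get("name")
        value = form_values.get(name, "")
        if field.get("required") == "true" and not value:
            raise ValueError(f"missing required field: {name}")
        ET.SubElement(dcr, "item", {"name": name}).text = value
    return ET.tostring(dcr, encoding="unicode")


def present(dcr_xml: str, pt: str) -> str:
    """Combine a DCR with a PT to generate the final HTML page."""
    page = pt
    for item in ET.fromstring(dcr_xml).findall("item"):
        placeholder = f'<value-of field="{item.get("name")}"/>'
        page = page.replace(placeholder, escape(item.text or ""))
    return page
```

The contributor only ever supplies the form values; the markup lives in the PT, so the same DCR could be rendered through a second PT (for instance a WML one) without being re-entered.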



When the presentation engine combines the DCR and the PT, it executes
any Perl callouts and conditional tags to generate the final page. The
jurisdiction of the different roles in the templating process can be 
summarized as: 




Fig 8: The Jurisdiction of Roles in the Templating Process



The templating architecture makes for a very powerful and flexible 
process and allows for single presentation templates to account for a 
wide range of related Web pages. Much of the template process can be 
automated; for example, workflows can be invoked at the DCR commit 
stage, or the final Web page can be generated automatically when a 
DCR is saved. It is wise not to underestimate the amount of work 
required to convert a Web site into templates. 

Workflow



Workflow is another core function of a CMS - it defines the content 
creation and approval processes. The workflow module in Teamsite was 
one of the most comprehensive and flexible we reviewed during the 
selection process. The biggest challenge we found in implementing 
workflows within the Getty is the analysis of the 'business' processes, 
which continues to be a challenge. This analysis is a time-consuming 
practice in an environment where these processes have grown 
organically over time. When interviewing staff around the Getty, we often 
found ourselves giving advice on consolidating their processes, since no 
one had thought to review what was happening. Often when we 
flowcharted the program's processes, staff were surprised at how 
redundant their activities were. The steps we use to analyze a process
are:

1. State the general requirement for the process 

2. Identify the tasks that need to be performed 

3. Identify who needs to perform the tasks 

4. Put the tasks in the order in which they need to be performed

5. Review the result against the process requirement

Having generated and signed off a flow chart representing the workflow 
process, the next step is to implement the workflow. WorkflowBuilder is 
a drag-and-drop flowcharting application based heavily on the Visio 
interface. 




Fig 9: An Example Workflow in WorkflowBuilder 



Figure 9 shows a typical workflow of six tasks. When invoked within 
Teamsite, the instance of this workflow is termed a job. It has a variety of 
variables associated with it, which are defined when the job starts. For 
example, one variable might be the name of a file that needs updating or 
the name of a directory where a new page needs to be created. The first 
task in figure 9 is a content-creation task which will be assigned to a user
using the owner variable. When the content has been written and
submitted, the workflow sends an e-mail to the appropriate approver 
with review instructions and a hyperlink to the relevant content. A 
rejection generates an e-mail back to the owner, but an approval submits 
the content which might invoke the automatic generation of a Web page 
ready for deployment. This workflow is very generic. The use of variables 
allows for many instances of a single workflow - the variables simply 
travel along with a particular workflow instance. The power of the
workflow engine can be extended immensely by additional task types, such as
a CGI task, which invokes a CGI script, and an external task, which runs any
external application or script on the host machine.
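
The idea of one generic workflow serving many jobs, each carrying its own variables, can be sketched as a small state machine. The class and variable names here (Job, owner, approver, file) are assumptions for illustration; real workflows are drawn in WorkflowBuilder rather than written this way, and the e-mails are only recorded, not sent.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """One running instance of a generic create-and-approve workflow."""
    variables: dict                              # defined when the job starts
    state: str = "create_content"
    outbox: list = field(default_factory=list)   # (recipient, subject) pairs

    def _expect(self, state: str) -> None:
        if self.state != state:
            raise RuntimeError(f"job is in state {self.state!r}, not {state!r}")

    def submit_content(self) -> None:
        """The owner finishes writing; the approver is notified by e-mail."""
        self._expect("create_content")
        subject = f"Please review {self.variables['file']}"
        self.outbox.append((self.variables["approver"], subject))
        self.state = "review"

    def review(self, approved: bool) -> None:
        """Approval submits the content; rejection returns it to the owner."""
        self._expect("review")
        if approved:
            self.state = "submitted"
        else:
            subject = f"Rejected: {self.variables['file']}"
            self.outbox.append((self.variables["owner"], subject))
            self.state = "create_content"
```

Two jobs created from this one definition, with different owner, approver and file values, run independently, which is the sense in which the variables travel along with a particular workflow instance.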

Deployment 

Deployment of content is the final step in the content generation process.
It is the sending and receiving of files from the base server to the receiver
host. Any number of deployment scenarios can be created, based on
three basic themes:

• Deploy files in a specified list - the list can be generated
programmatically.

• Deploy files in a directory by file comparison between the source
and the target.

• Deploy directories by directory comparison between the source
and the target.

The file transfer executes in a transactional mode, meaning that content
will not be replaced on the production server until all the requested
files have been received.
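
A file-comparison deployment with this all-or-nothing behaviour can be approximated as follows. This is a local-filesystem sketch under assumptions, not the product's deployment protocol: the changed_files and deploy helpers and the incoming directory are hypothetical, nothing crosses a network or firewall, and files deleted from the source are not handled.

```python
import os
import shutil
from pathlib import Path


def changed_files(source: Path, target: Path) -> list:
    """Relative paths of source files that are new or differ on the target."""
    changed = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = target / rel
        if not dst.is_file() or src.read_bytes() != dst.read_bytes():
            changed.append(rel)
    return changed


def deploy(source: Path, target: Path, incoming: Path) -> list:
    """Receive every changed file into `incoming`, then replace the targets."""
    files = changed_files(source, target)
    if incoming.exists():
        shutil.rmtree(incoming)
    # Phase 1: receive. A failure here leaves the target untouched.
    for rel in files:
        (incoming / rel).parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source / rel, incoming / rel)
    # Phase 2: replace, only once every requested file has arrived.
    # os.replace needs `incoming` and `target` on the same filesystem.
    for rel in files:
        (target / rel).parent.mkdir(parents=True, exist_ok=True)
        os.replace(incoming / rel, target / rel)
    shutil.rmtree(incoming, ignore_errors=True)
    return files
```

Separating the receive phase from the replace phase is what keeps a half-finished transfer from ever being visible on the production side.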



[Figure labels: Base server; teamsite.getty.edu; Deploy; Firewall; Receiver host]

Fig 10: Deployment



The Rollout 

One important factor in selecting this product was its ability to manage a 
Web site at a file level. This allowed us to implement Teamsite without 
disturbing the contributors around the Getty using their Dreamweaver 
and Homesite applications. We simply continued to import their files from 
the development server into Teamsite until we were ready to schedule 
their training and register them as users. The go-live date was an
uneventful day. One day we were 'pushing,' and the next day we were
'deploying'. This fact alone makes it a resounding success. The
process of trawling through the Web site leaving templated content in our
wake is ongoing, to be completed around June 2002.



Future Initiatives 

We selected Teamsite with a long-term web strategy in mind. We chose 
this particular product because we require a platform that will foster the 
best opportunities to meet our long term goals and also one that is 
flexible enough to integrate with presently unknown program initiatives 
and future Web trends. There are two immediate initiatives that are of
note: 

Vocabulary-Assisted Metatagging - There is a significant metadata 
initiative at the Getty and extensive research into vocabulary-assisted 
searching. Currently, searching on getty.edu results in an 'invisible' 
expansion against our ULAN (Union List of Artist Names) vocabulary. By
way of an example, if one searches for Carrucci on our Web site, 15 hits
are returned for Pontormo with no mention of Carrucci - because Carrucci
was the birth name of Pontormo as defined in ULAN. While we will
continue to expand search queries against our vocabularies (Art & 
Architecture Thesaurus and the Thesaurus of Geographic Names 
are planned shortly), we plan to integrate more fully the metatagging of 
content at the creation stage. As luck would have it, Teamsite has a 
module called Metatagger, which we have already begun to investigate. 
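
The 'invisible' expansion described above amounts to widening a query with every name the vocabulary records for the same artist. The sketch below is a minimal illustration, not the getty.edu search implementation: the tiny VOCABULARY table stands in for ULAN (only the Pontormo/Carrucci pairing comes from this paper), and the page paths, page text and function names are hypothetical.

```python
# Stand-in for a name vocabulary: preferred name -> variant names.
VOCABULARY = {"Pontormo": ["Carrucci"]}


def expand_query(term: str, vocabulary: dict = VOCABULARY) -> set:
    """Return the search term plus every name recorded for the same artist."""
    wanted = term.casefold()
    expanded = {wanted}
    for preferred, variants in vocabulary.items():
        names = {preferred.casefold(), *(v.casefold() for v in variants)}
        if wanted in names:
            expanded |= names
    return expanded


def search(term: str, pages: dict, vocabulary: dict = VOCABULARY) -> list:
    """Paths of pages whose text mentions any of the expanded terms."""
    terms = expand_query(term, vocabulary)
    return sorted(path for path, text in pages.items()
                  if any(t in text.casefold() for t in terms))
```

With this in place, a search for Carrucci finds pages that mention only Pontormo, even though the searcher never sees the substitution.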

Personalisation - The 'next big thing' for our Web site is a venture into 
the world of Personalization. Again, as luck would have it, Interwoven is 
partnered with a variety of personalization server providers. At the time of 
writing, we are in the selection process, so stay tuned. 

Conclusion 

Teamsite is an expensive solution to enterprise-wide content 
management that the Getty is in the fortunate position to be able to 
afford. Moreover, our Web strategy is such that we require an 'industrial 
grade’ CMS such as this to achieve our goals. The implementation 
involved a significant amount of planning, strategizing and work plus a 
high level of technical skill and support. The ongoing maintenance and 
development of the site through Teamsite has greatly increased the skill 
level of the Web group and shifted the typical requirements of a 
‘manually’ maintained Web site. In some areas it has polarized the
webgroup resources, requiring more technical people to work on the
‘back end’ but fewer technical and fewer Web-literate people on the ‘front
end’. The executive-level expectation of a CMS is often a reduction in
resources required for Web site development. To a certain extent this is 
true; however, those reduced resources are at a higher technical level 
that may offset any perceived budgetary reductions. For us, the 
webgroup is just as busy after the implementation. The difference is that 
we’re accomplishing more. 











